{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.680788500Z",
     "start_time": "2023-06-11T11:12:29.865570700Z"
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from sklearn import utils"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "### Data cleaning, exploration, and visualization\n",
    "#### Columns and descriptions\n",
    "|Column|Type|Description|\n",
    "|---|---|---|\n",
    "|id|int|Record ID|\n",
    "|satisfaction_v2|String|Satisfaction survey result|\n",
    "|Gender|String|Gender|\n",
    "|Customer Type|String|Customer type|\n",
    "|Age|int|Age|\n",
    "|Type of Travel|String|Type of travel|\n",
    "|Class|String|Cabin class (business / economy / economy plus)|\n",
    "|Flight Distance|int|Flight distance|\n",
    "|Seat comfort|int|Seat comfort rating|\n",
    "|Departure/Arrival time convenient|int|Satisfaction with departure/arrival time convenience|\n",
    "|Food and drink|int|Food and drink satisfaction|\n",
    "|Gate location|int|Convenience of the gate location|\n",
    "|Inflight wifi service|int|In-flight Wi-Fi service rating|\n",
    "|Inflight entertainment|int|In-flight entertainment rating|\n",
    "|Online support|int|Online support rating|\n",
    "|Ease of Online booking|int|Ease of online booking|\n",
    "|On-board service|int|On-board service satisfaction|\n",
    "|Leg room service|int|Legroom satisfaction|\n",
    "|Baggage handling|int|Baggage handling satisfaction|\n",
    "|Checkin service|int|Check-in service satisfaction|\n",
    "|Cleanliness|int|Cabin cleanliness satisfaction|\n",
    "|Online boarding|int|Online boarding satisfaction|\n",
    "|Departure Delay in Minutes|int|Departure delay in minutes|\n",
    "|Arrival Delay in Minutes|float|Arrival delay in minutes (contains missing values)|"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "outputs": [
    {
     "data": {
      "text/plain": "                satisfaction_v2  Gender      Customer Type  Age  \\\nid                                                                \n1       neutral or dissatisfied    Male  disloyal Customer   48   \n2                     satisfied  Female     Loyal Customer   35   \n3                     satisfied    Male     Loyal Customer   41   \n4                     satisfied    Male     Loyal Customer   50   \n5                     satisfied  Female     Loyal Customer   49   \n...                         ...     ...                ...  ...   \n129876  neutral or dissatisfied    Male     Loyal Customer   28   \n129877  neutral or dissatisfied    Male     Loyal Customer   41   \n129878  neutral or dissatisfied    Male     Loyal Customer   42   \n129879                satisfied    Male     Loyal Customer   50   \n129880                satisfied  Female     Loyal Customer   20   \n\n         Type of Travel     Class  Flight Distance  Seat comfort  \\\nid                                                                 \n1       Business travel  Business             2211             3   \n2       Business travel  Business              840             2   \n3       Business travel  Business              879             4   \n4       Business travel  Business             1932             2   \n5       Business travel  Business             3512             3   \n...                 ...       ...              ...           ...   \n129876  Personal Travel  Eco Plus             1916             4   \n129877  Personal Travel  Eco Plus             1483             3   \n129878  Personal Travel  Eco Plus             1685             2   \n129879  Personal Travel  Eco Plus             1229             5   \n129880  Personal Travel  Eco Plus             1472             3   \n\n        Departure/Arrival time convenient  Food and drink  ...  \\\nid                                                         ...   
\n1                                       3               3  ...   \n2                                       2               2  ...   \n3                                       4               4  ...   \n4                                       2               2  ...   \n5                                       3               3  ...   \n...                                   ...             ...  ...   \n129876                                  4               4  ...   \n129877                                  5               3  ...   \n129878                                  5               2  ...   \n129879                                  4               4  ...   \n129880                                  3               3  ...   \n\n        Online support  Ease of Online booking  On-board service  \\\nid                                                                 \n1                    5                       5                 3   \n2                    4                       5                 5   \n3                    5                       3                 3   \n4                    5                       5                 5   \n5                    4                       3                 3   \n...                ...                     ...               ...   
\n129876               1                       4                 5   \n129877               2                       2                 5   \n129878               3                       3                 3   \n129879               4                       3                 4   \n129880               5                       4                 4   \n\n        Leg room service  Baggage handling  Checkin service  Cleanliness  \\\nid                                                                         \n1                      2                 5                4            5   \n2                      5                 5                3            5   \n3                      3                 3                4            3   \n4                      5                 5                3            5   \n5                      4                 3                3            3   \n...                  ...               ...              ...          ...   \n129876                 4                 4                4            5   \n129877                 5                 5                5            4   \n129878                 4                 5                4            4   \n129879                 5                 5                3            4   \n129880                 5                 4                5            4   \n\n        Online boarding  Departure Delay in Minutes  Arrival Delay in Minutes  \nid                                                                             \n1                     5                           2                       5.0  \n2                     5                          26                      39.0  \n3                     5                           0                       0.0  \n4                     4                           0                       0.0  \n5                     5                           0                       1.0  \n...                 ...                         ...                       ...  
\n129876                4                           2                       3.0  \n129877                2                           0                       0.0  \n129878                3                           6                      14.0  \n129879                3                          31                      22.0  \n129880                3                           0                       0.0  \n\n[129880 rows x 23 columns]",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>satisfaction_v2</th>\n      <th>Gender</th>\n      <th>Customer Type</th>\n      <th>Age</th>\n      <th>Type of Travel</th>\n      <th>Class</th>\n      <th>Flight Distance</th>\n      <th>Seat comfort</th>\n      <th>Departure/Arrival time convenient</th>\n      <th>Food and drink</th>\n      <th>...</th>\n      <th>Online support</th>\n      <th>Ease of Online booking</th>\n      <th>On-board service</th>\n      <th>Leg room service</th>\n      <th>Baggage handling</th>\n      <th>Checkin service</th>\n      <th>Cleanliness</th>\n      <th>Online boarding</th>\n      <th>Departure Delay in Minutes</th>\n      <th>Arrival Delay in Minutes</th>\n    </tr>\n    <tr>\n      <th>id</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>1</th>\n      <td>neutral or dissatisfied</td>\n      <td>Male</td>\n      <td>disloyal Customer</td>\n      <td>48</td>\n      <td>Business travel</td>\n      <td>Business</td>\n      <td>2211</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>2</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>2</td>\n      <td>5.0</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      
<td>satisfied</td>\n      <td>Female</td>\n      <td>Loyal Customer</td>\n      <td>35</td>\n      <td>Business travel</td>\n      <td>Business</td>\n      <td>840</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>5</td>\n      <td>26</td>\n      <td>39.0</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>satisfied</td>\n      <td>Male</td>\n      <td>Loyal Customer</td>\n      <td>41</td>\n      <td>Business travel</td>\n      <td>Business</td>\n      <td>879</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>5</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>satisfied</td>\n      <td>Male</td>\n      <td>Loyal Customer</td>\n      <td>50</td>\n      <td>Business travel</td>\n      <td>Business</td>\n      <td>1932</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>4</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>satisfied</td>\n      <td>Female</td>\n      <td>Loyal Customer</td>\n      <td>49</td>\n      <td>Business travel</td>\n      <td>Business</td>\n      <td>3512</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>1.0</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      
<td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>129876</th>\n      <td>neutral or dissatisfied</td>\n      <td>Male</td>\n      <td>Loyal Customer</td>\n      <td>28</td>\n      <td>Personal Travel</td>\n      <td>Eco Plus</td>\n      <td>1916</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>1</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>2</td>\n      <td>3.0</td>\n    </tr>\n    <tr>\n      <th>129877</th>\n      <td>neutral or dissatisfied</td>\n      <td>Male</td>\n      <td>Loyal Customer</td>\n      <td>41</td>\n      <td>Personal Travel</td>\n      <td>Eco Plus</td>\n      <td>1483</td>\n      <td>3</td>\n      <td>5</td>\n      <td>3</td>\n      <td>...</td>\n      <td>2</td>\n      <td>2</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>4</td>\n      <td>2</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>129878</th>\n      <td>neutral or dissatisfied</td>\n      <td>Male</td>\n      <td>Loyal Customer</td>\n      <td>42</td>\n      <td>Personal Travel</td>\n      <td>Eco Plus</td>\n      <td>1685</td>\n      <td>2</td>\n      <td>5</td>\n      <td>2</td>\n      <td>...</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>3</td>\n      <td>6</td>\n      <td>14.0</td>\n    </tr>\n    <tr>\n      <th>129879</th>\n      <td>satisfied</td>\n      <td>Male</td>\n      <td>Loyal Customer</td>\n      <td>50</td>\n      <td>Personal Travel</td>\n      <td>Eco Plus</td>\n      
<td>1229</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>4</td>\n      <td>3</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>31</td>\n      <td>22.0</td>\n    </tr>\n    <tr>\n      <th>129880</th>\n      <td>satisfied</td>\n      <td>Female</td>\n      <td>Loyal Customer</td>\n      <td>20</td>\n      <td>Personal Travel</td>\n      <td>Eco Plus</td>\n      <td>1472</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>3</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n  </tbody>\n</table>\n<p>129880 rows × 23 columns</p>\n</div>"
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data = pd.read_csv(\"../dataset/satisfaction_v2.csv\",index_col=0)\n",
    "data = data.sort_index()\n",
    "data"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.691758400Z",
     "start_time": "2023-06-11T11:12:30.120887100Z"
    }
   }
  },
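  {
   "cell_type": "code",
   "execution_count": 3,
   "outputs": [],
   "source": [
    "data.isnull().sum()  # \"Arrival Delay in Minutes\" has 393 missing values; drop those rows\n",
    "# .copy() returns an independent DataFrame, so later column assignments do not trigger SettingWithCopyWarning\n",
    "data = data[data[\"Arrival Delay in Minutes\"].notnull()].copy()"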
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.691758400Z",
     "start_time": "2023-06-11T11:12:30.453997600Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "outputs": [
    {
     "data": {
      "text/plain": "        satisfaction_v2  Gender  Customer Type  Age  Type of Travel  Class  \\\nid                                                                           \n1                     0       0              0   48               0      0   \n2                     1       1              1   35               0      0   \n3                     1       0              1   41               0      0   \n4                     1       0              1   50               0      0   \n5                     1       1              1   49               0      0   \n...                 ...     ...            ...  ...             ...    ...   \n129876                0       0              1   28               1      2   \n129877                0       0              1   41               1      2   \n129878                0       0              1   42               1      2   \n129879                1       0              1   50               1      2   \n129880                1       1              1   20               1      2   \n\n        Flight Distance  Seat comfort  Departure/Arrival time convenient  \\\nid                                                                         \n1                  2211             3                                  3   \n2                   840             2                                  2   \n3                   879             4                                  4   \n4                  1932             2                                  2   \n5                  3512             3                                  3   \n...                 ...           ...                                ...   
\n129876             1916             4                                  4   \n129877             1483             3                                  5   \n129878             1685             2                                  5   \n129879             1229             5                                  4   \n129880             1472             3                                  3   \n\n        Food and drink  ...  Online support  Ease of Online booking  \\\nid                      ...                                           \n1                    3  ...               5                       5   \n2                    2  ...               4                       5   \n3                    4  ...               5                       3   \n4                    2  ...               5                       5   \n5                    3  ...               4                       3   \n...                ...  ...             ...                     ...   \n129876               4  ...               1                       4   \n129877               3  ...               2                       2   \n129878               2  ...               3                       3   \n129879               4  ...               4                       3   \n129880               3  ...               5                       4   \n\n        On-board service  Leg room service  Baggage handling  Checkin service  \\\nid                                                                              \n1                      3                 2                 5                4   \n2                      5                 5                 5                3   \n3                      3                 3                 3                4   \n4                      5                 5                 5                3   \n5                      3                 4                 3                3   \n...                  ...               ...               ...              ...   
\n129876                 5                 4                 4                4   \n129877                 5                 5                 5                5   \n129878                 3                 4                 5                4   \n129879                 4                 5                 5                3   \n129880                 4                 5                 4                5   \n\n        Cleanliness  Online boarding  Departure Delay in Minutes  \\\nid                                                                 \n1                 5                5                           2   \n2                 5                5                          26   \n3                 3                5                           0   \n4                 5                4                           0   \n5                 3                5                           0   \n...             ...              ...                         ...   \n129876            5                4                           2   \n129877            4                2                           0   \n129878            4                3                           6   \n129879            4                3                          31   \n129880            4                3                           0   \n\n        Arrival Delay in Minutes  \nid                                \n1                            5.0  \n2                           39.0  \n3                            0.0  \n4                            0.0  \n5                            1.0  \n...                          ...  \n129876                       3.0  \n129877                       0.0  \n129878                      14.0  \n129879                      22.0  \n129880                       0.0  \n\n[129487 rows x 23 columns]",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>satisfaction_v2</th>\n      <th>Gender</th>\n      <th>Customer Type</th>\n      <th>Age</th>\n      <th>Type of Travel</th>\n      <th>Class</th>\n      <th>Flight Distance</th>\n      <th>Seat comfort</th>\n      <th>Departure/Arrival time convenient</th>\n      <th>Food and drink</th>\n      <th>...</th>\n      <th>Online support</th>\n      <th>Ease of Online booking</th>\n      <th>On-board service</th>\n      <th>Leg room service</th>\n      <th>Baggage handling</th>\n      <th>Checkin service</th>\n      <th>Cleanliness</th>\n      <th>Online boarding</th>\n      <th>Departure Delay in Minutes</th>\n      <th>Arrival Delay in Minutes</th>\n    </tr>\n    <tr>\n      <th>id</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>1</th>\n      <td>0</td>\n      <td>0</td>\n      <td>0</td>\n      <td>48</td>\n      <td>0</td>\n      <td>0</td>\n      <td>2211</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>2</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>2</td>\n      <td>5.0</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      
<td>35</td>\n      <td>0</td>\n      <td>0</td>\n      <td>840</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>5</td>\n      <td>26</td>\n      <td>39.0</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>41</td>\n      <td>0</td>\n      <td>0</td>\n      <td>879</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>5</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>50</td>\n      <td>0</td>\n      <td>0</td>\n      <td>1932</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>4</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>49</td>\n      <td>0</td>\n      <td>0</td>\n      <td>3512</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>1.0</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      
<td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>129876</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      <td>28</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1916</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>1</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>2</td>\n      <td>3.0</td>\n    </tr>\n    <tr>\n      <th>129877</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      <td>41</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1483</td>\n      <td>3</td>\n      <td>5</td>\n      <td>3</td>\n      <td>...</td>\n      <td>2</td>\n      <td>2</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>4</td>\n      <td>2</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>129878</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      <td>42</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1685</td>\n      <td>2</td>\n      <td>5</td>\n      <td>2</td>\n      <td>...</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>3</td>\n      <td>6</td>\n      <td>14.0</td>\n    </tr>\n    <tr>\n      <th>129879</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>50</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1229</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>4</td>\n      <td>3</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>31</td>\n      <td>22.0</td>\n    </tr>\n    <tr>\n      <th>129880</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>20</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1472</td>\n      <td>3</td>\n      
<td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>3</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n  </tbody>\n</table>\n<p>129487 rows × 23 columns</p>\n</div>"
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Label encoding (integer codes, not one-hot encoding):\n",
    "# each category is numbered in order of first appearance in the column.\n",
    "categorical_columns = [\"satisfaction_v2\", \"Gender\", \"Customer Type\", \"Type of Travel\", \"Class\"]\n",
    "for column in categorical_columns:\n",
    "    codes = {value: code for code, value in enumerate(data[column].unique())}\n",
    "    data[column] = data[column].map(codes)\n",
    "data"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.692757500Z",
     "start_time": "2023-06-11T11:12:30.500817Z"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "#### Columns and descriptions (after preprocessing)\n",
    "|Column|Type|Description|\n",
    "|---|---|---|\n",
    "|id|int|Record ID|\n",
    "|satisfaction_v2|int (0/1)|Neutral or dissatisfied (0) / satisfied (1)|\n",
    "|Gender|int (0/1)|Gender: male (0) / female (1)|\n",
    "|Customer Type|int (0/1)|Customer type: disloyal (0) / loyal (1)|\n",
    "|Age|int|Age|\n",
    "|Type of Travel|int (0/1)|Type of travel: business (0) / personal (1)|\n",
    "|Class|int (0/1/2)|Cabin class: business (0) / economy (1) / economy plus (2)|\n",
    "|Flight Distance|int|Flight distance|\n",
    "|Seat comfort|int|Seat comfort rating|\n",
    "|Departure/Arrival time convenient|int|Satisfaction with departure/arrival time convenience|\n",
    "|Food and drink|int|Food and drink satisfaction|\n",
    "|Gate location|int|Convenience of the gate location|\n",
    "|Inflight wifi service|int|In-flight Wi-Fi service rating|\n",
    "|Inflight entertainment|int|In-flight entertainment rating|\n",
    "|Online support|int|Online support rating|\n",
    "|Ease of Online booking|int|Ease of online booking|\n",
    "|On-board service|int|On-board service satisfaction|\n",
    "|Leg room service|int|Legroom satisfaction|\n",
    "|Baggage handling|int|Baggage handling satisfaction|\n",
    "|Checkin service|int|Check-in service satisfaction|\n",
    "|Cleanliness|int|Cabin cleanliness satisfaction|\n",
    "|Online boarding|int|Online boarding satisfaction|\n",
    "|Departure Delay in Minutes|int|Departure delay in minutes|\n",
    "|Arrival Delay in Minutes|float|Arrival delay in minutes|"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "outputs": [],
   "source": [
    "def split_data(data: pd.DataFrame, train_ratio=0.8, test_ratio=0.1, shuffle=False, k_fold: int = None):\n",
    "    \"\"\"\n",
    "    分离数据集\n",
    "    默认：不打乱顺序(可选打乱顺序)，训练集80%，测试集10%，验证集10%\n",
    "    可选方式：K折交叉验证，训练集与测试集按K折分离\n",
    "    :param data: 数据集\n",
    "    :param train_ratio: 训练集比例\n",
    "    :param test_ratio: 测试集比例(测试集+验证集<1.0)，验证集比例=1-(训练集比例+测试集比例)\n",
    "    :param shuffle: 是否打乱顺序\n",
    "    :param k_fold: 默认为None，值为交叉验证K值\n",
    "    :return: train_data,test_data,dev_data|train_data_list,test_data_list\n",
    "    \"\"\"\n",
    "    length = len(data)\n",
    "    if shuffle:\n",
    "        data = utils.shuffle(data)\n",
    "    if k_fold == None:\n",
    "        train_data = data.iloc[:int(length * train_ratio), :]\n",
    "        test_data = data.iloc[int(length * train_ratio):int(length * (train_ratio + test_ratio)), :]\n",
    "        dev_data = data.iloc[int(length * (train_ratio + test_ratio)):, :]\n",
    "        return train_data, test_data, dev_data\n",
    "    else:\n",
    "        train_data_list = []\n",
    "        test_data_list = []\n",
    "        for k in range(k_fold):\n",
    "            train1 = data.iloc[0:int(length / k_fold * k), :]\n",
    "            train2 = data.iloc[int(length / k_fold * (k + 1)):len(data), :]\n",
    "            train_data = pd.concat([train1, train2])\n",
    "            test_data = data.iloc[int(length / k_fold * k):int(length / k_fold * (k + 1)), :]\n",
    "            train_data_list.append(train_data)\n",
    "            test_data_list.append(test_data)\n",
    "        return train_data_list, test_data_list"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.692757500Z",
     "start_time": "2023-06-11T11:12:30.816971200Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "outputs": [],
   "source": [
    "train_data, test_data, dev_data = split_data(data)"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.692757500Z",
     "start_time": "2023-06-11T11:12:30.829937700Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "outputs": [
    {
     "data": {
      "text/plain": "        satisfaction_v2  Gender  Customer Type  Age  Type of Travel  Class  \\\nid                                                                           \n1                     0       0              0   48               0      0   \n2                     1       1              1   35               0      0   \n3                     1       0              1   41               0      0   \n4                     1       0              1   50               0      0   \n5                     1       1              1   49               0      0   \n...                 ...     ...            ...  ...             ...    ...   \n103909                1       1              1   51               0      0   \n103910                1       0              1   54               1      0   \n103911                1       0              1   52               0      0   \n103912                1       0              1   51               0      0   \n103913                1       0              1   53               0      0   \n\n        Flight Distance  Seat comfort  Departure/Arrival time convenient  \\\nid                                                                         \n1                  2211             3                                  3   \n2                   840             2                                  2   \n3                   879             4                                  4   \n4                  1932             2                                  2   \n5                  3512             3                                  3   \n...                 ...           ...                                ...   
\n103909               63             0                                  5   \n103910              540             5                                  4   \n103911             2328             4                                  4   \n103912              537             3                                  3   \n103913              804             2                                  2   \n\n        Food and drink  ...  Online support  Ease of Online booking  \\\nid                      ...                                           \n1                    3  ...               5                       5   \n2                    2  ...               4                       5   \n3                    4  ...               5                       3   \n4                    2  ...               5                       5   \n5                    3  ...               4                       3   \n...                ...  ...             ...                     ...   \n103909               0  ...               5                       5   \n103910               5  ...               5                       1   \n103911               4  ...               4                       4   \n103912               3  ...               4                       4   \n103913               2  ...               4                       5   \n\n        On-board service  Leg room service  Baggage handling  Checkin service  \\\nid                                                                              \n1                      3                 2                 5                4   \n2                      5                 5                 5                3   \n3                      3                 3                 3                4   \n4                      5                 5                 5                3   \n5                      3                 4                 3                3   \n...                  ...               ...               ...              ...   
\n103909                 5                 5                 5                5   \n103910                 1                 5                 1                4   \n103911                 4                 4                 4                4   \n103912                 4                 4                 4                5   \n103913                 5                 5                 5                5   \n\n        Cleanliness  Online boarding  Departure Delay in Minutes  \\\nid                                                                 \n1                 5                5                           2   \n2                 5                5                          26   \n3                 3                5                           0   \n4                 5                4                           0   \n5                 3                5                           0   \n...             ...              ...                         ...   \n103909            5                3                          38   \n103910            1                3                           0   \n103911            4                5                          10   \n103912            4                5                           4   \n103913            5                5                           0   \n\n        Arrival Delay in Minutes  \nid                                \n1                            5.0  \n2                           39.0  \n3                            0.0  \n4                            0.0  \n5                            1.0  \n...                          ...  \n103909                      66.0  \n103910                       0.0  \n103911                      10.0  \n103912                       6.0  \n103913                       0.0  \n\n[103589 rows x 23 columns]",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>satisfaction_v2</th>\n      <th>Gender</th>\n      <th>Customer Type</th>\n      <th>Age</th>\n      <th>Type of Travel</th>\n      <th>Class</th>\n      <th>Flight Distance</th>\n      <th>Seat comfort</th>\n      <th>Departure/Arrival time convenient</th>\n      <th>Food and drink</th>\n      <th>...</th>\n      <th>Online support</th>\n      <th>Ease of Online booking</th>\n      <th>On-board service</th>\n      <th>Leg room service</th>\n      <th>Baggage handling</th>\n      <th>Checkin service</th>\n      <th>Cleanliness</th>\n      <th>Online boarding</th>\n      <th>Departure Delay in Minutes</th>\n      <th>Arrival Delay in Minutes</th>\n    </tr>\n    <tr>\n      <th>id</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>1</th>\n      <td>0</td>\n      <td>0</td>\n      <td>0</td>\n      <td>48</td>\n      <td>0</td>\n      <td>0</td>\n      <td>2211</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>2</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>2</td>\n      <td>5.0</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      
<td>35</td>\n      <td>0</td>\n      <td>0</td>\n      <td>840</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>5</td>\n      <td>26</td>\n      <td>39.0</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>41</td>\n      <td>0</td>\n      <td>0</td>\n      <td>879</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>5</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>50</td>\n      <td>0</td>\n      <td>0</td>\n      <td>1932</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>4</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>49</td>\n      <td>0</td>\n      <td>0</td>\n      <td>3512</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>1.0</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      
<td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>103909</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>51</td>\n      <td>0</td>\n      <td>0</td>\n      <td>63</td>\n      <td>0</td>\n      <td>5</td>\n      <td>0</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>38</td>\n      <td>66.0</td>\n    </tr>\n    <tr>\n      <th>103910</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>54</td>\n      <td>1</td>\n      <td>0</td>\n      <td>540</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>...</td>\n      <td>5</td>\n      <td>1</td>\n      <td>1</td>\n      <td>5</td>\n      <td>1</td>\n      <td>4</td>\n      <td>1</td>\n      <td>3</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>103911</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>52</td>\n      <td>0</td>\n      <td>0</td>\n      <td>2328</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>10</td>\n      <td>10.0</td>\n    </tr>\n    <tr>\n      <th>103912</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>51</td>\n      <td>0</td>\n      <td>0</td>\n      <td>537</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>6.0</td>\n    </tr>\n    <tr>\n      <th>103913</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>53</td>\n      <td>0</td>\n      <td>0</td>\n      <td>804</td>\n      <td>2</td>\n      
<td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n  </tbody>\n</table>\n<p>103589 rows × 23 columns</p>\n</div>"
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_data"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.692757500Z",
     "start_time": "2023-06-11T11:12:30.844896900Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "outputs": [
    {
     "data": {
      "text/plain": "        satisfaction_v2  Gender  Customer Type  Age  Type of Travel  Class  \\\nid                                                                           \n103914                1       1              1   56               0      0   \n103915                0       0              1   55               0      1   \n103916                1       1              1   52               0      0   \n103917                1       1              1   51               0      0   \n103918                1       1              1   49               0      0   \n...                 ...     ...            ...  ...             ...    ...   \n116892                1       1              1   51               0      0   \n116893                1       0              1   59               0      0   \n116894                1       1              1   48               0      0   \n116895                1       0              1   41               0      0   \n116896                0       0              1   42               0      1   \n\n        Flight Distance  Seat comfort  Departure/Arrival time convenient  \\\nid                                                                         \n103914              548             3                                  3   \n103915             1363             2                                  3   \n103916              537             3                                  3   \n103917              787             1                                  1   \n103918             3111             5                                  5   \n...                 ...           ...                                ...   
\n116892             3892             0                                  0   \n116893             2355             2                                  2   \n116894              420             5                                  5   \n116895             3189             0                                  0   \n116896             1271             3                                  5   \n\n        Food and drink  ...  Online support  Ease of Online booking  \\\nid                      ...                                           \n103914               3  ...               4                       4   \n103915               2  ...               2                       2   \n103916               3  ...               4                       5   \n103917               1  ...               5                       2   \n103918               2  ...               4                       5   \n...                ...  ...             ...                     ...   \n116892               0  ...               5                       5   \n116893               2  ...               5                       4   \n116894               5  ...               5                       5   \n116895               0  ...               4                       5   \n116896               5  ...               3                       3   \n\n        On-board service  Leg room service  Baggage handling  Checkin service  \\\nid                                                                              \n103914                 4                 4                 4                4   \n103915                 1                 1                 3                1   \n103916                 5                 5                 5                3   \n103917                 2                 3                 2                3   \n103918                 5                 5                 5                5   \n...                  ...               ...               ...              ...   
\n116892                 5                 5                 5                3   \n116893                 4                 4                 4                3   \n116894                 5                 4                 5                4   \n116895                 5                 5                 5                4   \n116896                 4                 4                 4                3   \n\n        Cleanliness  Online boarding  Departure Delay in Minutes  \\\nid                                                                 \n103914            4                5                          43   \n103915            4                2                           0   \n103916            5                4                           7   \n103917            2                4                          25   \n103918            5                4                          24   \n...             ...              ...                         ...   \n116892            5                5                           9   \n116893            4                4                           0   \n116894            5                5                          26   \n116895            5                5                          10   \n116896            4                3                          48   \n\n        Arrival Delay in Minutes  \nid                                \n103914                      46.0  \n103915                       0.0  \n103916                       3.0  \n103917                      24.0  \n103918                      24.0  \n...                          ...  \n116892                       0.0  \n116893                       0.0  \n116894                      18.0  \n116895                       5.0  \n116896                      38.0  \n\n[12949 rows x 23 columns]",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>satisfaction_v2</th>\n      <th>Gender</th>\n      <th>Customer Type</th>\n      <th>Age</th>\n      <th>Type of Travel</th>\n      <th>Class</th>\n      <th>Flight Distance</th>\n      <th>Seat comfort</th>\n      <th>Departure/Arrival time convenient</th>\n      <th>Food and drink</th>\n      <th>...</th>\n      <th>Online support</th>\n      <th>Ease of Online booking</th>\n      <th>On-board service</th>\n      <th>Leg room service</th>\n      <th>Baggage handling</th>\n      <th>Checkin service</th>\n      <th>Cleanliness</th>\n      <th>Online boarding</th>\n      <th>Departure Delay in Minutes</th>\n      <th>Arrival Delay in Minutes</th>\n    </tr>\n    <tr>\n      <th>id</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>103914</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>56</td>\n      <td>0</td>\n      <td>0</td>\n      <td>548</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>43</td>\n      <td>46.0</td>\n    </tr>\n    <tr>\n      <th>103915</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      
<td>55</td>\n      <td>0</td>\n      <td>1</td>\n      <td>1363</td>\n      <td>2</td>\n      <td>3</td>\n      <td>2</td>\n      <td>...</td>\n      <td>2</td>\n      <td>2</td>\n      <td>1</td>\n      <td>1</td>\n      <td>3</td>\n      <td>1</td>\n      <td>4</td>\n      <td>2</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>103916</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>52</td>\n      <td>0</td>\n      <td>0</td>\n      <td>537</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>4</td>\n      <td>7</td>\n      <td>3.0</td>\n    </tr>\n    <tr>\n      <th>103917</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>51</td>\n      <td>0</td>\n      <td>0</td>\n      <td>787</td>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>...</td>\n      <td>5</td>\n      <td>2</td>\n      <td>2</td>\n      <td>3</td>\n      <td>2</td>\n      <td>3</td>\n      <td>2</td>\n      <td>4</td>\n      <td>25</td>\n      <td>24.0</td>\n    </tr>\n    <tr>\n      <th>103918</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>49</td>\n      <td>0</td>\n      <td>0</td>\n      <td>3111</td>\n      <td>5</td>\n      <td>5</td>\n      <td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>4</td>\n      <td>24</td>\n      <td>24.0</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n 
     <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>116892</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>51</td>\n      <td>0</td>\n      <td>0</td>\n      <td>3892</td>\n      <td>0</td>\n      <td>0</td>\n      <td>0</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>5</td>\n      <td>9</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>116893</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>59</td>\n      <td>0</td>\n      <td>0</td>\n      <td>2355</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>3</td>\n      <td>4</td>\n      <td>4</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>116894</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>48</td>\n      <td>0</td>\n      <td>0</td>\n      <td>420</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>26</td>\n      <td>18.0</td>\n    </tr>\n    <tr>\n      <th>116895</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>41</td>\n      <td>0</td>\n      <td>0</td>\n      <td>3189</td>\n      <td>0</td>\n      <td>0</td>\n      <td>0</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>10</td>\n      <td>5.0</td>\n    </tr>\n    <tr>\n      <th>116896</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      <td>42</td>\n      <td>0</td>\n      <td>1</td>\n      <td>1271</td>\n      <td>3</td>\n      
<td>5</td>\n      <td>5</td>\n      <td>...</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>48</td>\n      <td>38.0</td>\n    </tr>\n  </tbody>\n</table>\n<p>12949 rows × 23 columns</p>\n</div>"
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_data"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.693755100Z",
     "start_time": "2023-06-11T11:12:30.890774700Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "outputs": [
    {
     "data": {
      "text/plain": "        satisfaction_v2  Gender  Customer Type  Age  Type of Travel  Class  \\\nid                                                                           \n116897                1       1              1   33               0      0   \n116898                1       1              1   53               0      0   \n116899                0       1              0   25               0      1   \n116900                1       0              1   40               0      0   \n116901                1       0              0   48               0      0   \n...                 ...     ...            ...  ...             ...    ...   \n129876                0       0              1   28               1      2   \n129877                0       0              1   41               1      2   \n129878                0       0              1   42               1      2   \n129879                1       0              1   50               1      2   \n129880                1       1              1   20               1      2   \n\n        Flight Distance  Seat comfort  Departure/Arrival time convenient  \\\nid                                                                         \n116897              369             0                                  0   \n116898              692             1                                  1   \n116899             1753             2                                  3   \n116900              416             5                                  4   \n116901             1779             3                                  4   \n...                 ...           ...                                ...   
\n129876             1916             4                                  4   \n129877             1483             3                                  5   \n129878             1685             2                                  5   \n129879             1229             5                                  4   \n129880             1472             3                                  3   \n\n        Food and drink  ...  Online support  Ease of Online booking  \\\nid                      ...                                           \n116897               0  ...               4                       2   \n116898               1  ...               5                       4   \n116899               2  ...               4                       4   \n116900               5  ...               5                       5   \n116901               4  ...               1                       1   \n...                ...  ...             ...                     ...   \n129876               4  ...               1                       4   \n129877               3  ...               2                       2   \n129878               2  ...               3                       3   \n129879               4  ...               4                       3   \n129880               3  ...               5                       4   \n\n        On-board service  Leg room service  Baggage handling  Checkin service  \\\nid                                                                              \n116897                 2                 2                 2                3   \n116898                 4                 4                 4                5   \n116899                 1                 4                 4                2   \n116900                 5                 5                 5                4   \n116901                 5                 4                 4                4   \n...                  ...               ...               ...              ...   
\n129876                 5                 4                 4                4   \n129877                 5                 5                 5                5   \n129878                 3                 4                 5                4   \n129879                 4                 5                 5                3   \n129880                 4                 5                 4                5   \n\n        Cleanliness  Online boarding  Departure Delay in Minutes  \\\nid                                                                 \n116897            2                3                          41   \n116898            4                5                           0   \n116899            3                4                           0   \n116900            5                3                           0   \n116901            4                1                           2   \n...             ...              ...                         ...   \n129876            5                4                           2   \n129877            4                2                           0   \n129878            4                3                           6   \n129879            4                3                          31   \n129880            4                3                           0   \n\n        Arrival Delay in Minutes  \nid                                \n116897                      55.0  \n116898                       0.0  \n116899                       0.0  \n116900                       0.0  \n116901                       0.0  \n...                          ...  \n129876                       3.0  \n129877                       0.0  \n129878                      14.0  \n129879                      22.0  \n129880                       0.0  \n\n[12949 rows x 23 columns]",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>satisfaction_v2</th>\n      <th>Gender</th>\n      <th>Customer Type</th>\n      <th>Age</th>\n      <th>Type of Travel</th>\n      <th>Class</th>\n      <th>Flight Distance</th>\n      <th>Seat comfort</th>\n      <th>Departure/Arrival time convenient</th>\n      <th>Food and drink</th>\n      <th>...</th>\n      <th>Online support</th>\n      <th>Ease of Online booking</th>\n      <th>On-board service</th>\n      <th>Leg room service</th>\n      <th>Baggage handling</th>\n      <th>Checkin service</th>\n      <th>Cleanliness</th>\n      <th>Online boarding</th>\n      <th>Departure Delay in Minutes</th>\n      <th>Arrival Delay in Minutes</th>\n    </tr>\n    <tr>\n      <th>id</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>116897</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>33</td>\n      <td>0</td>\n      <td>0</td>\n      <td>369</td>\n      <td>0</td>\n      <td>0</td>\n      <td>0</td>\n      <td>...</td>\n      <td>4</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>3</td>\n      <td>2</td>\n      <td>3</td>\n      <td>41</td>\n      <td>55.0</td>\n    </tr>\n    <tr>\n      <th>116898</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      
<td>53</td>\n      <td>0</td>\n      <td>0</td>\n      <td>692</td>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>...</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>116899</th>\n      <td>0</td>\n      <td>1</td>\n      <td>0</td>\n      <td>25</td>\n      <td>0</td>\n      <td>1</td>\n      <td>1753</td>\n      <td>2</td>\n      <td>3</td>\n      <td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>4</td>\n      <td>1</td>\n      <td>4</td>\n      <td>4</td>\n      <td>2</td>\n      <td>3</td>\n      <td>4</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>116900</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>40</td>\n      <td>0</td>\n      <td>0</td>\n      <td>416</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>3</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>116901</th>\n      <td>1</td>\n      <td>0</td>\n      <td>0</td>\n      <td>48</td>\n      <td>0</td>\n      <td>0</td>\n      <td>1779</td>\n      <td>3</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>1</td>\n      <td>1</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>1</td>\n      <td>2</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n     
 <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>129876</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      <td>28</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1916</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>1</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>2</td>\n      <td>3.0</td>\n    </tr>\n    <tr>\n      <th>129877</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      <td>41</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1483</td>\n      <td>3</td>\n      <td>5</td>\n      <td>3</td>\n      <td>...</td>\n      <td>2</td>\n      <td>2</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>4</td>\n      <td>2</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>129878</th>\n      <td>0</td>\n      <td>0</td>\n      <td>1</td>\n      <td>42</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1685</td>\n      <td>2</td>\n      <td>5</td>\n      <td>2</td>\n      <td>...</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>3</td>\n      <td>6</td>\n      <td>14.0</td>\n    </tr>\n    <tr>\n      <th>129879</th>\n      <td>1</td>\n      <td>0</td>\n      <td>1</td>\n      <td>50</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1229</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>4</td>\n      <td>3</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>31</td>\n      <td>22.0</td>\n    </tr>\n    <tr>\n      <th>129880</th>\n      <td>1</td>\n      <td>1</td>\n      <td>1</td>\n      <td>20</td>\n      <td>1</td>\n      <td>2</td>\n      <td>1472</td>\n      <td>3</td>\n      
<td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>5</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>3</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n  </tbody>\n</table>\n<p>12949 rows × 23 columns</p>\n</div>"
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dev_data"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.693755100Z",
     "start_time": "2023-06-11T11:12:30.922689600Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "outputs": [],
   "source": [
    "train_y=train_data[\"satisfaction_v2\"]\n",
    "train_x=train_data.drop(labels='satisfaction_v2', axis=1)\n",
    "test_y=test_data[\"satisfaction_v2\"]\n",
    "test_x=test_data.drop(labels='satisfaction_v2', axis=1)\n",
    "dev_y=dev_data[\"satisfaction_v2\"]\n",
    "dev_x=dev_data.drop(labels='satisfaction_v2', axis=1)"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.693755100Z",
     "start_time": "2023-06-11T11:12:30.953605500Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "outputs": [
    {
     "data": {
      "text/plain": "        Gender  Customer Type  Age  Type of Travel  Class  Flight Distance  \\\nid                                                                           \n1            0              0   48               0      0             2211   \n2            1              1   35               0      0              840   \n3            0              1   41               0      0              879   \n4            0              1   50               0      0             1932   \n5            1              1   49               0      0             3512   \n...        ...            ...  ...             ...    ...              ...   \n103909       1              1   51               0      0               63   \n103910       0              1   54               1      0              540   \n103911       0              1   52               0      0             2328   \n103912       0              1   51               0      0              537   \n103913       0              1   53               0      0              804   \n\n        Seat comfort  Departure/Arrival time convenient  Food and drink  \\\nid                                                                        \n1                  3                                  3               3   \n2                  2                                  2               2   \n3                  4                                  4               4   \n4                  2                                  2               2   \n5                  3                                  3               3   \n...              ...                                ...             ...   
\n103909             0                                  5               0   \n103910             5                                  4               5   \n103911             4                                  4               4   \n103912             3                                  3               3   \n103913             2                                  2               2   \n\n        Gate location  ...  Online support  Ease of Online booking  \\\nid                     ...                                           \n1                   3  ...               5                       5   \n2                   2  ...               4                       5   \n3                   4  ...               5                       3   \n4                   2  ...               5                       5   \n5                   3  ...               4                       3   \n...               ...  ...             ...                     ...   \n103909              5  ...               5                       5   \n103910              4  ...               5                       1   \n103911              4  ...               4                       4   \n103912              3  ...               4                       4   \n103913              2  ...               4                       5   \n\n        On-board service  Leg room service  Baggage handling  Checkin service  \\\nid                                                                              \n1                      3                 2                 5                4   \n2                      5                 5                 5                3   \n3                      3                 3                 3                4   \n4                      5                 5                 5                3   \n5                      3                 4                 3                3   \n...                  ...               ...               ...              ...   
\n103909                 5                 5                 5                5   \n103910                 1                 5                 1                4   \n103911                 4                 4                 4                4   \n103912                 4                 4                 4                5   \n103913                 5                 5                 5                5   \n\n        Cleanliness  Online boarding  Departure Delay in Minutes  \\\nid                                                                 \n1                 5                5                           2   \n2                 5                5                          26   \n3                 3                5                           0   \n4                 5                4                           0   \n5                 3                5                           0   \n...             ...              ...                         ...   \n103909            5                3                          38   \n103910            1                3                           0   \n103911            4                5                          10   \n103912            4                5                           4   \n103913            5                5                           0   \n\n        Arrival Delay in Minutes  \nid                                \n1                            5.0  \n2                           39.0  \n3                            0.0  \n4                            0.0  \n5                            1.0  \n...                          ...  \n103909                      66.0  \n103910                       0.0  \n103911                      10.0  \n103912                       6.0  \n103913                       0.0  \n\n[103589 rows x 22 columns]",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Gender</th>\n      <th>Customer Type</th>\n      <th>Age</th>\n      <th>Type of Travel</th>\n      <th>Class</th>\n      <th>Flight Distance</th>\n      <th>Seat comfort</th>\n      <th>Departure/Arrival time convenient</th>\n      <th>Food and drink</th>\n      <th>Gate location</th>\n      <th>...</th>\n      <th>Online support</th>\n      <th>Ease of Online booking</th>\n      <th>On-board service</th>\n      <th>Leg room service</th>\n      <th>Baggage handling</th>\n      <th>Checkin service</th>\n      <th>Cleanliness</th>\n      <th>Online boarding</th>\n      <th>Departure Delay in Minutes</th>\n      <th>Arrival Delay in Minutes</th>\n    </tr>\n    <tr>\n      <th>id</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>1</th>\n      <td>0</td>\n      <td>0</td>\n      <td>48</td>\n      <td>0</td>\n      <td>0</td>\n      <td>2211</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>2</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>2</td>\n      <td>5.0</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>1</td>\n      <td>1</td>\n      <td>35</td>\n      <td>0</td>\n  
    <td>0</td>\n      <td>840</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>5</td>\n      <td>26</td>\n      <td>39.0</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>0</td>\n      <td>1</td>\n      <td>41</td>\n      <td>0</td>\n      <td>0</td>\n      <td>879</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>5</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>0</td>\n      <td>1</td>\n      <td>50</td>\n      <td>0</td>\n      <td>0</td>\n      <td>1932</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>5</td>\n      <td>4</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>1</td>\n      <td>1</td>\n      <td>49</td>\n      <td>0</td>\n      <td>0</td>\n      <td>3512</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>4</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>5</td>\n      <td>0</td>\n      <td>1.0</td>\n    </tr>\n    <tr>\n      <th>...</th>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      <td>...</td>\n      
<td>...</td>\n      <td>...</td>\n      <td>...</td>\n    </tr>\n    <tr>\n      <th>103909</th>\n      <td>1</td>\n      <td>1</td>\n      <td>51</td>\n      <td>0</td>\n      <td>0</td>\n      <td>63</td>\n      <td>0</td>\n      <td>5</td>\n      <td>0</td>\n      <td>5</td>\n      <td>...</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>3</td>\n      <td>38</td>\n      <td>66.0</td>\n    </tr>\n    <tr>\n      <th>103910</th>\n      <td>0</td>\n      <td>1</td>\n      <td>54</td>\n      <td>1</td>\n      <td>0</td>\n      <td>540</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>...</td>\n      <td>5</td>\n      <td>1</td>\n      <td>1</td>\n      <td>5</td>\n      <td>1</td>\n      <td>4</td>\n      <td>1</td>\n      <td>3</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n    <tr>\n      <th>103911</th>\n      <td>0</td>\n      <td>1</td>\n      <td>52</td>\n      <td>0</td>\n      <td>0</td>\n      <td>2328</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>...</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>10</td>\n      <td>10.0</td>\n    </tr>\n    <tr>\n      <th>103912</th>\n      <td>0</td>\n      <td>1</td>\n      <td>51</td>\n      <td>0</td>\n      <td>0</td>\n      <td>537</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>3</td>\n      <td>...</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>5</td>\n      <td>4</td>\n      <td>6.0</td>\n    </tr>\n    <tr>\n      <th>103913</th>\n      <td>0</td>\n      <td>1</td>\n      <td>53</td>\n      <td>0</td>\n      <td>0</td>\n      <td>804</td>\n      <td>2</td>\n      <td>2</td>\n      <td>2</td>\n      
<td>2</td>\n      <td>...</td>\n      <td>4</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>5</td>\n      <td>0</td>\n      <td>0.0</td>\n    </tr>\n  </tbody>\n</table>\n<p>103589 rows × 22 columns</p>\n</div>"
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_x"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.693755100Z",
     "start_time": "2023-06-11T11:12:30.969564400Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "outputs": [
    {
     "data": {
      "text/plain": "id\n1         0\n2         1\n3         1\n4         1\n5         1\n         ..\n103909    1\n103910    1\n103911    1\n103912    1\n103913    1\nName: satisfaction_v2, Length: 103589, dtype: int64"
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_y"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2023-06-11T11:12:35.694751700Z",
     "start_time": "2023-06-11T11:12:31.017435200Z"
    }
   }
  }
 ],
 "metadata": {
  "kernelspec": {
   "name": "ml-20230610",
   "language": "python",
   "display_name": "ml-20230610"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
